Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid re-assigning the global pid for client backends and bg workers when the application_name changes #7791

Merged
merged 12 commits into from
Dec 23, 2024

Conversation

onurctirtir
Copy link
Member

@onurctirtir onurctirtir commented Dec 17, 2024

DESCRIPTION: Fixes a crash that happens because of unsafe catalog access when re-assigning the global pid after application_name changes.

When application_name changes, we don't actually need to
try re-assigning the global pid for external client backends because
application_name doesn't affect the global pid for such backends. Plus,
trying to re-assign the global pid for external client backends would
unnecessarily cause performing a catalog access when the cached local
node id is invalidated. However, accessing to the catalog tables is
dangerous in certain situations like when we're not in a transaction
block. And for the other types of backends, i.e., the Citus internal
backends, we need to re-assign the global pid when the application_name
changes because for such backends we simply extract the global pid
inherited from the originating backend from the application_name -that's
specified by originating backend when openning that connection- and this
doesn't require catalog access.

Copy link

codecov bot commented Dec 17, 2024

Codecov Report

Attention: Patch coverage is 71.42857% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.70%. Comparing base (665d72a) to head (02d9d20).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7791      +/-   ##
==========================================
- Coverage   89.70%   89.70%   -0.01%     
==========================================
  Files         283      283              
  Lines       60509    60516       +7     
  Branches     7541     7542       +1     
==========================================
+ Hits        54278    54284       +6     
+ Misses       4076     4075       -1     
- Partials     2155     2157       +2     

@onurctirtir
Copy link
Member Author

onurctirtir commented Dec 18, 2024

AssignGlobalPID() deals with two kinds of backends:

  1. External client backends
  2. Others, i.e., Citus internal backends.

And we may need to perform catalog access for the class-1, not for class-2. This is because, for class-1, we make use of the local node id to actually generate a global pid (GenerateGlobalPID()). However for class-2, we simply extract the global pid inherited from the originating backend from the application_name specified by originating backend when openning that connection and extracting the global pid from application_name doesn't require catalog access at all (ExtractGlobalPID()).

And the nice part is that for class-I, the global pid doesn't change when application_name changes. For this reason, rather than such a code change that this PR proposed earlier, only taking an action for class-2 looks like a more appropriate fix.

@onurctirtir onurctirtir changed the title Avoid re-assigning the global pid when we cannot retrieve local node id without unsafe catalog access Avoid re-assigning the global pid for client backends when the application_name changes Dec 18, 2024
@onurctirtir onurctirtir marked this pull request as ready for review December 18, 2024 11:38
@onurctirtir onurctirtir force-pushed the fix-unsafe-reassign-global-pid branch from a870bca to 9f2f0e3 Compare December 18, 2024 12:00
@onurctirtir onurctirtir force-pushed the fix-unsafe-reassign-global-pid branch from 9f2f0e3 to 1bd0ed8 Compare December 18, 2024 13:38
@onurctirtir
Copy link
Member Author

onurctirtir commented Dec 18, 2024

AssignGlobalPID() deals with two kinds of backends:

  1. External client backends
  2. Others, i.e., Citus internal backends.

And we may need to perform catalog access for the class-1, not for class-2. This is because, for class-1, we make use of the local node id to actually generate a global pid (GenerateGlobalPID()). However for class-2, we simply extract the global pid inherited from the originating backend from the application_name specified by originating backend when openning that connection and extracting the global pid from application_name doesn't require catalog access at all (ExtractGlobalPID()).

And the nice part is that for class-I, the global pid doesn't change when application_name changes. For this reason, rather than such a code change that this PR proposed earlier, only taking an action for class-2 looks like a more appropriate fix.

Ok, so looks like, we rely on ApplicationNameAssignHook to assign global pid for external client backends as well even though such cases are quite rare, so I need to revert back to the initial design.

…e application_name changes"

This reverts commit 1bd0ed8.
…id without unsafe catalog access

(cherry picked from commit d06c0cd)
@onurctirtir onurctirtir changed the title Avoid re-assigning the global pid for client backends when the application_name changes Avoid re-assigning the global pid when we cannot retrieve local node id without unsafe catalog access Dec 18, 2024
@onurctirtir onurctirtir changed the title Avoid re-assigning the global pid when we cannot retrieve local node id without unsafe catalog access Avoid re-assigning the global pid for client backends and bg workers when the application_name changes Dec 20, 2024
@onurctirtir
Copy link
Member Author

AssignGlobalPID() deals with two kinds of backends:

  1. External client backends
  2. Others, i.e., Citus internal backends.

And we may need to perform catalog access for the class-1, not for class-2. This is because, for class-1, we make use of the local node id to actually generate a global pid (GenerateGlobalPID()). However for class-2, we simply extract the global pid inherited from the originating backend from the application_name specified by originating backend when openning that connection and extracting the global pid from application_name doesn't require catalog access at all (ExtractGlobalPID()).
And the nice part is that for class-I, the global pid doesn't change when application_name changes. For this reason, rather than such a code change that this PR proposed earlier, only taking an action for class-2 looks like a more appropriate fix.

Ok, so looks like, we rely on ApplicationNameAssignHook to assign global pid for external client backends as well even though such cases are quite rare, so I need to revert back to the initial design.

Fixed that failing test too, so decided to move forward with the alternative approach documented again.

@emelsimsek
Copy link
Contributor

LGTM.

@emelsimsek emelsimsek self-requested a review December 23, 2024 08:53
@onurctirtir onurctirtir enabled auto-merge (squash) December 23, 2024 13:46
@onurctirtir onurctirtir merged commit 7341191 into main Dec 23, 2024
151 of 154 checks passed
@onurctirtir onurctirtir deleted the fix-unsafe-reassign-global-pid branch December 23, 2024 14:01
Copy link

@codeforall codeforall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

onurctirtir added a commit that referenced this pull request Dec 24, 2024
…when the application_name changes (#7791)

DESCRIPTION: Fixes a crash that happens because of unsafe catalog access
when re-assigning the global pid after application_name changes.

When application_name changes, we don't actually need to
try re-assigning the global pid for external client backends because
application_name doesn't affect the global pid for such backends. Plus,
trying to re-assign the global pid for external client backends would
unnecessarily cause performing a catalog access when the cached local
node id is invalidated. However, accessing to the catalog tables is
dangerous in certain situations like when we're not in a transaction
block. And for the other types of backends, i.e., the Citus internal
backends, we need to re-assign the global pid when the application_name
changes because for such backends we simply extract the global pid
inherited from the originating backend from the application_name -that's
specified by originating backend when openning that connection- and this
doesn't require catalog access.

(cherry picked from commit 7341191)
naisila pushed a commit that referenced this pull request Jan 13, 2025
…when the application_name changes (#7791)

DESCRIPTION: Fixes a crash that happens because of unsafe catalog access
when re-assigning the global pid after application_name changes.

When application_name changes, we don't actually need to
try re-assigning the global pid for external client backends because
application_name doesn't affect the global pid for such backends. Plus,
trying to re-assign the global pid for external client backends would
unnecessarily cause performing a catalog access when the cached local
node id is invalidated. However, accessing to the catalog tables is
dangerous in certain situations like when we're not in a transaction
block. And for the other types of backends, i.e., the Citus internal
backends, we need to re-assign the global pid when the application_name
changes because for such backends we simply extract the global pid
inherited from the originating backend from the application_name -that's
specified by originating backend when openning that connection- and this
doesn't require catalog access.

(cherry picked from commit 7341191)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants